Add v2 and v3 metadata support to codecs #3332
base: main
this is now in a phase where I would really appreciate eyes from @zarr-developers/python-core-devs. The goal of this PR is twofold:
In this PR, when a user shows up with

This PR also adds TypedDict classes for the v2 and v3 form of each codec, which was laborious but IMO worth it for type safety.

If you have time, please look this over and/or test it on your v2 -> v3 workloads. That would be extremely helpful. I think these changes are on the same scale as the data type changes, so this requires a lot of finesse and potentially follow-up PRs.
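To make the TypedDict idea concrete, here is a minimal sketch of what a v2 form and a v3 form of one codec's JSON could look like, using the zlib shapes that appear elsewhere in this thread. The class names `ZlibJSON_V2`, `ZlibConfig_V3`, `ZlibJSON_V3` and the helpers `v2_to_v3` / `v3_to_v2` are hypothetical, chosen for illustration; they are not claimed to be the names used in this PR.

```python
from typing import Literal, TypedDict


class ZlibJSON_V2(TypedDict):
    """Hypothetical v2 form: a flat dict identified by 'id'."""

    id: Literal["zlib"]
    level: int


class ZlibConfig_V3(TypedDict):
    """Hypothetical configuration payload for the v3 form."""

    level: int


class ZlibJSON_V3(TypedDict):
    """Hypothetical v3 form: 'name' plus a nested 'configuration'."""

    name: Literal["numcodecs.zlib"]
    configuration: ZlibConfig_V3


def v2_to_v3(data: ZlibJSON_V2) -> ZlibJSON_V3:
    # Move the flat v2 parameters under 'configuration'.
    return {"name": "numcodecs.zlib", "configuration": {"level": data["level"]}}


def v3_to_v2(data: ZlibJSON_V3) -> ZlibJSON_V2:
    # Flatten the v3 configuration back into the v2 layout.
    return {"id": "zlib", "level": data["configuration"]["level"]}
```

With shapes like these, a static type checker can flag a misspelled key or a wrongly typed parameter before the metadata is ever written.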
I think the last thing I need to do here is write a test that ensures compatibility between these changes and older versions of zarr python 3.x.
I am concerned that the additions to the codec API in this PR will be disruptive to people who implemented custom Zarr V3 codecs, e.g., anyone who defined a class that inherited from in
yeah currently in virtual-tiff I get lots of failures (119 failed, 317 passed) with most due to

I think this would probably be an issue for other parsers too, such as gribberish and Sean's new HRRRarser. The VirtualiZarr tests also fail with errors such as:
This is super useful feedback. I'll add virtual-tiff as a dev dependency while I work out how to make these changes non-breaking.
More context for this:

>>> from numcodecs.zarr3 import Zlib as ZlibV3
>>> from numcodecs import Zlib
>>> Zlib().get_config()
{'id': 'zlib', 'level': 1}
>>> ZlibV3().to_dict()
/Users/d-v-b/.cache/uv/archive-v0/RPIFUeEX8IUCTWZnqf1cL/lib/python3.12/site-packages/numcodecs/zarr3.py:164: UserWarning: Numcodecs codecs are not in the Zarr version 3 specification and may not be supported by other zarr implementations.
super().__init__(**codec_config)
{'name': 'numcodecs.zlib', 'configuration': {}}
>>> ZlibV3(fake_param=10).to_dict()
{'name': 'numcodecs.zlib', 'configuration': {'fake_param': 10}}

What you see here is a massive flaw in the slapdash design of the codecs in
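Two things stand out in the session above: the default `level` is absent from the v3 output, and an unknown keyword (`fake_param`) is accepted and serialized without complaint. As a rough sketch of one way to avoid both, and not necessarily the approach this PR takes, a codec can declare its parameters explicitly and reject anything else. `StrictZlib` is a hypothetical class written for illustration only; it is not part of numcodecs or zarr-python.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class StrictZlib:
    """Hypothetical codec wrapper with an explicit, closed set of parameters."""

    level: int = 1

    def to_dict(self) -> dict[str, Any]:
        # Always emit every declared parameter, including defaults.
        return {"name": "numcodecs.zlib", "configuration": {"level": self.level}}

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "StrictZlib":
        config = dict(data.get("configuration", {}))
        allowed = {f.name for f in fields(cls)}
        unknown = sorted(set(config) - allowed)
        if unknown:
            # Refuse parameters the codec does not declare.
            raise ValueError(f"unknown configuration keys: {unknown}")
        return cls(**config)
```

Here `StrictZlib(fake_param=10)` raises a `TypeError` from the dataclass constructor, and `StrictZlib.from_dict` raises a `ValueError` when the stored configuration contains an undeclared key.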
This PR will give each codec v2 and v3 JSON de/serialization routines.
depends on #3318